The University of Chicago. On Semi-supervised Kernel Methods. A Dissertation Submitted to the Faculty of the Division of the Physical Sciences in Candidacy for the Degree of Doctor of Philosophy, Department of Computer Science, by Vikas Sindhwani

Abstract

Semi-supervised learning is an emerging computational paradigm for learning from limited supervision by utilizing large amounts of inexpensive, unsupervised observations. Not only does this paradigm carry appeal as a model for natural learning, but it also meets a growing practical need in most if not all applications of machine learning: those where data can be collected cheaply and automatically in abundance, but manual labeling for the purpose of training learning algorithms is often slow, expensive, and error-prone. In this thesis, we develop families of algorithms for semi-supervised inference. These algorithms are based on intuitions about the natural structure and geometry of the probability distributions that underlie typical datasets for learning. The classical framework of Regularization in Reproducing Kernel Hilbert Spaces (which is the basis of state-of-the-art supervised algorithms such as SVMs) is extended in several ways to utilize unlabeled data. These extensions are embodied in the following contributions: (1) Manifold Regularization is based on the assumption that high-dimensional data truly resides on low-dimensional manifolds. Ambient, globally defined kernels are combined with the intrinsic Laplacian regularizer to develop new kernels that immediately turn standard supervised kernel methods into semi-supervised learners. The outstanding problem of out-of-sample extension in graph transductive methods is resolved in this framework. (2) Low-density Methods bias learning so that data clusters are protected from being cut by decision boundaries, at the expense of turning regularization objectives into non-convex functionals. We analyze the nature of this non-convexity and propose deterministic annealing techniques to overcome local minima.
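The abstract gives no formulas, so the following is only a minimal sketch of how an ambient kernel and a graph Laplacian regularizer can be combined in a single kernel method. It assumes the Laplacian regularized least squares closed form from the manifold regularization literature, an RBF ambient kernel, and an unweighted k-nearest-neighbor graph. The function names (`fit_laprls`, `predict_laprls`) and the hyperparameters (`gamma_A`, `gamma_I`, `sigma`, `k`) are illustrative assumptions, not the thesis's own code.

```python
import numpy as np

def _sq_dists(A, B):
    # Pairwise squared Euclidean distances, clipped at 0 for numerical safety.
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.maximum(sq, 0.0)

def rbf_kernel(A, B, sigma=1.0):
    # Ambient (globally defined) kernel; the RBF choice is an assumption.
    return np.exp(-_sq_dists(A, B) / (2.0 * sigma ** 2))

def knn_laplacian(X, k=5):
    # Unnormalized Laplacian L = D - W of a symmetrized, unweighted kNN graph.
    n = X.shape[0]
    d = _sq_dists(X, X)
    np.fill_diagonal(d, np.inf)          # exclude self-neighbors
    idx = np.argsort(d, axis=1)[:, :k]   # k nearest neighbors of each point
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)               # symmetrize
    return np.diag(W.sum(axis=1)) - W

def fit_laprls(X, y_labeled, gamma_A=1e-2, gamma_I=1e-1, sigma=1.0, k=5):
    # X holds labeled points first, then unlabeled points.
    # Solves (J K + gamma_A * l * I + gamma_I * l / n^2 * L K) alpha = Y,
    # where J selects the labeled points and Y is zero on unlabeled ones.
    n, l = X.shape[0], len(y_labeled)
    K = rbf_kernel(X, X, sigma)
    L = knn_laplacian(X, k)
    J = np.zeros((n, n))
    J[:l, :l] = np.eye(l)
    Y = np.zeros(n)
    Y[:l] = y_labeled
    A = J @ K + gamma_A * l * np.eye(n) + (gamma_I * l / n ** 2) * (L @ K)
    return np.linalg.solve(A, Y)

def predict_laprls(X_train, alpha, X_new, sigma=1.0):
    # Out-of-sample extension: f(x) = sum_i alpha_i K(x_i, x) for any new x.
    return rbf_kernel(X_new, X_train, sigma) @ alpha
```

Because the learned function is a kernel expansion over all labeled and unlabeled training points, it can be evaluated at points never seen during training, which is the out-of-sample property the abstract refers to. Setting `gamma_I=0` in this sketch reduces it to ordinary regularized least squares on the labeled points alone.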


Publication date: 2007